Sparse and Unique Nonnegative Matrix Factorization Through Data Preprocessing
Nonnegative matrix factorization (NMF) has become a very popular technique in
machine learning because it automatically extracts meaningful features through
a sparse and part-based representation. However, NMF has the drawback of being
highly ill-posed, that is, there typically exist many different but equivalent
factorizations. In this paper, we introduce a completely new way of obtaining
more well-posed NMF problems whose solutions are sparser. Our technique is
based on the preprocessing of the nonnegative input data matrix, and relies on
the theory of M-matrices and the geometric interpretation of NMF. This approach
provably leads to optimal and sparse solutions under the separability
assumption of Donoho and Stodden (NIPS, 2003), and, for rank-three matrices,
makes the number of exact factorizations finite. We illustrate the
effectiveness of our technique on several image datasets.
Comment: 34 pages, 11 figures
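The ill-posedness mentioned in this abstract can be seen on a small example. The sketch below is only an illustration, not the preprocessing proposed in the paper: it assumes strictly positive factors `W` and `H` and a hypothetical helper `equivalent_factorization`, and shows that a different pair of nonnegative factors gives exactly the same product.

```python
import numpy as np

def equivalent_factorization(W, H, eps):
    """Return (W2, H2) with W2 @ H2 == W @ H, using an invertible matrix Q."""
    r = W.shape[1]
    Q = np.eye(r)
    Q[0, 1] = eps        # Q = I + eps * e_0 e_1^T
    Q_inv = np.eye(r)
    Q_inv[0, 1] = -eps   # exact inverse, since (e_0 e_1^T)^2 = 0
    return W @ Q, Q_inv @ H

rng = np.random.default_rng(0)
# Entries in [0.5, 1) keep both new factors nonnegative for eps = 0.1.
W = rng.uniform(0.5, 1.0, size=(6, 3))
H = rng.uniform(0.5, 1.0, size=(3, 8))
W2, H2 = equivalent_factorization(W, H, eps=0.1)
```

Here `W2` and `H2` are nonnegative and differ from `W` and `H`, yet `W2 @ H2` equals `W @ H` up to rounding, so the data alone cannot distinguish the two factorizations.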
Robustness Analysis of Hottopixx, a Linear Programming Model for Factoring Nonnegative Matrices
Although nonnegative matrix factorization (NMF) is NP-hard in general, it has
been shown very recently that it is tractable under the assumption that the
input nonnegative data matrix is close to being separable (separability
requires that all columns of the input matrix belong to the cone spanned by a
small subset of these columns). Since then, several algorithms have been
designed to handle this subclass of NMF problems. In particular, Bittorf,
Recht, Ré and Tropp ('Factoring nonnegative matrices with linear programs',
NIPS 2012) proposed a linear programming model, referred to as Hottopixx. In
this paper, we provide a new and more general robustness analysis of their
method. In particular, we design a provably more robust variant using a
post-processing strategy which allows us to deal with duplicates and near
duplicates in the dataset.
Comment: 23 pages; new numerical results; Comparison with Arora et al.;
Accepted in SIAM J. Mat. Anal. App
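The separability condition described in this abstract can be checked numerically for a candidate subset of columns. The sketch below is not the Hottopixx linear program: it is a simple check, under the assumption that the candidate index set `K` is already known, that uses SciPy's nonnegative least squares solver to measure how far each column lies from the cone spanned by the selected columns. The helper name `conic_residuals` is hypothetical.

```python
import numpy as np
from scipy.optimize import nnls

def conic_residuals(M, K):
    """Distance from each column of M to the cone spanned by the columns M[:, K]."""
    basis = M[:, K]
    # nnls returns (solution, residual norm); only the residual norm is kept.
    return np.array([nnls(basis, M[:, j])[1] for j in range(M.shape[1])])

rng = np.random.default_rng(1)
W = rng.uniform(size=(10, 3))
# The first three columns of M are the columns of W, so M is separable with K = [0, 1, 2].
H = np.hstack([np.eye(3), rng.uniform(size=(3, 5))])
M = W @ H

res_good = conic_residuals(M, [0, 1, 2])  # all residuals close to zero
res_bad = conic_residuals(M, [3, 4, 5])   # some columns fall outside this cone
```

A residual vector that is zero up to rounding means every column belongs to the cone of the chosen subset, which is the noiseless separable case.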
Generalized Separable Nonnegative Matrix Factorization
Nonnegative matrix factorization (NMF) is a linear dimensionality reduction technique
for nonnegative data with applications such as image analysis, text mining,
audio source separation and hyperspectral unmixing. Given a data matrix $M$ and
a factorization rank $r$, NMF looks for a nonnegative matrix $W$ with $r$
columns and a nonnegative matrix $H$ with $r$ rows such that $M \approx WH$.
NMF is NP-hard to solve in general. However, it can be computed efficiently
under the separability assumption which requires that the basis vectors appear
as data points, that is, that there exists an index set $\mathcal{K}$ such that
$W = M(:,\mathcal{K})$. In this paper, we generalize the separability
assumption: We only require that for each rank-one factor $W(:,k)H(k,:)$ for
$k = 1, 2, \dots, r$, either $W(:,k) = M(:,j)$ for some $j$ or $H(k,:) = M(i,:)$ for
some $i$. We refer to the corresponding problem as generalized separable NMF
(GS-NMF). We discuss some properties of GS-NMF and propose a convex
optimization model which we solve using a fast gradient method. We also propose
a heuristic algorithm inspired by the successive projection algorithm. To
verify the effectiveness of our methods, we compare them with several
state-of-the-art separable NMF algorithms on synthetic, document and image data
sets.
Comment: 31 pages, 12 figures, 4 tables. We have added discussions about the
identifiability of the model, we have modified the first synthetic
experiment, we have clarified some aspects of the contributio
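For reference, the classical successive projection algorithm (SPA) that the heuristic in this abstract is said to be inspired by can be sketched as follows. This is the standard column-selection SPA for separable NMF, not the GS-NMF heuristic itself, and it assumes the noiseless setting in which the basis matrix has full column rank and the columns of the weight matrix sum to at most one. The function name `spa` and the test data are illustrative.

```python
import numpy as np

def spa(M, r):
    """Successive projection algorithm: select r column indices of M."""
    R = np.array(M, dtype=float)   # residual matrix, updated in place of M
    selected = []
    for _ in range(r):
        norms = np.sum(R ** 2, axis=0)
        j = int(np.argmax(norms))          # column with the largest residual norm
        selected.append(j)
        u = R[:, j] / np.sqrt(norms[j])    # unit vector along the selected column
        R = R - np.outer(u, u @ R)         # project out that direction
    return selected

rng = np.random.default_rng(2)
W = rng.uniform(size=(10, 3))
# The remaining columns are convex combinations of the columns of W.
H_mix = rng.dirichlet(np.ones(3), size=5).T
M = W @ np.hstack([np.eye(3), H_mix])
indices = spa(M, 3)
```

On this synthetic matrix the selected indices are the first three columns, in some order, because a strictly convex norm over a polytope is maximized only at a vertex.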
Sequential Dimensionality Reduction for Extracting Localized Features
Linear dimensionality reduction techniques are powerful tools for image
analysis as they allow the identification of important features in a data set.
In particular, nonnegative matrix factorization (NMF) has become very popular
as it is able to extract sparse, localized and easily interpretable features by
imposing an additive combination of nonnegative basis elements. Nonnegative
matrix underapproximation (NMU) is a closely related technique that has the
advantage of identifying features sequentially. In this paper, we propose a
variant of NMU that is particularly well suited for image analysis as it
incorporates the spatial information, that is, it takes into account the fact
that neighboring pixels are more likely to be contained in the same features,
and favors the extraction of localized features by looking for sparse basis
elements. We show that our new approach competes favorably with comparable
state-of-the-art techniques on synthetic, facial and hyperspectral image data
sets.
Comment: 24 pages, 12 figures. New numerical experiments on synthetic data
sets, discussion about the convergence
A Fast Gradient Method for Nonnegative Sparse Regression with Self Dictionary
A nonnegative matrix factorization (NMF) can be computed efficiently under
the separability assumption, which asserts that all the columns of the given
input data matrix belong to the cone generated by a (small) subset of them. The
provably most robust methods to identify these conic basis columns are based on
nonnegative sparse regression and self dictionaries, and require the solution
of large-scale convex optimization problems. In this paper we study a
particular nonnegative sparse regression model with self dictionary. As opposed
to previously proposed models, this model yields a smooth optimization problem
where the sparsity is enforced through linear constraints. We show that the
Euclidean projection on the polyhedron defined by these constraints can be
computed efficiently, and propose a fast gradient method to solve our model. We
compare our algorithm with several state-of-the-art methods on synthetic data
sets and real-world hyperspectral images.
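The general pattern of a fast gradient method, a projected gradient step combined with Nesterov-type extrapolation, can be illustrated on a much simpler problem than the one studied in this paper. The sketch below solves plain nonnegative least squares, where the projection is just a clipping at zero. The paper's self-dictionary model, its linear constraints and its dedicated projection are not reproduced here; the function name `fast_gradient_nnls` and the test problem are assumptions made for illustration.

```python
import numpy as np

def fast_gradient_nnls(A, b, iters=500):
    """Minimize 0.5 * ||A x - b||^2 subject to x >= 0 with an accelerated projected gradient."""
    L = np.linalg.norm(A, 2) ** 2   # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    y = x.copy()
    t = 1.0
    for _ in range(iters):
        grad = A.T @ (A @ y - b)
        x_new = np.maximum(y - grad / L, 0.0)            # projection onto x >= 0
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)    # extrapolation step
        x, t = x_new, t_new
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((30, 5))
x_true = np.array([1.0, 0.0, 2.0, 0.0, 0.5])
b = A @ x_true
x_hat = fast_gradient_nnls(A, b)
```

Since `b` is generated from a nonnegative `x_true` and `A` has full column rank, `x_hat` should recover `x_true` closely. In the paper's setting the clipping step would be replaced by the Euclidean projection onto the polyhedron defined by its constraints.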